Abstract

This thesis describes an interactive text editor which
stores, displays and modifies a program's parse tree using a
CRT display terminal. The programmer builds the parse tree
of his program top-down, by selecting valid productions of
the grammar with a light pen. This method makes it
impossible to construct a program which is syntactically
incorrect.

The usage of the editor is briefly described, followed
by a description of the algorithms and data structures used
to implement it.
1 Introduction

A compiler usually regards a program as a linear string
of symbols. A programmer, on the other hand, tends to regard
his program as a hierarchy of sub-programs and primitive
structures (e.g. control structures, executable statements
and expressions). The programmer seldom, if ever, composes
his program in the same order that it is scanned by the
compiler. When the time comes to make changes, the
programmer is faced with a similar problem. His computer
generated listings again show the program as a linear list
of symbols. What is needed is a text editor which stores and
modifies a program based on its hierarchical structure.

Many programming languages are defined syntactically
with the aid of a context-free grammar. In most of these
languages (e.g. ALGOL-60) the parse tree of the program
mirrors the hierarchical structure visualized by the
programmer.

In the following chapters, we describe a text editor
capable of storing, modifying and displaying a program's
parse tree using an interactive graphics display terminal.
This program has been implemented for use with the Project
SUE System Language [Clark 71a, 71b]. Written in PL/I [IBM
73], this editor is in operation on an IBM 360
using a 2250 terminal (CRT with light pen) as the user
interface.


1.1 Related Work in this Area

Cameron et al. have implemented a programming system
which makes syntactically invalid entries impossible
[Cameron 67]. This system uses a special graphics display
which does not have a keyboard. The programmer "types" his
messages by choosing symbols from the display screen with a
light pen, one at a time. Only those symbols which are
syntactically valid as responses will be displayed.

Bratman et al. have implemented a syntax-checking
editor for the JOVIAL programming language [Bratman 68].
Their editor uses a keyboard for program entry and syntax
checking is only done on a local (line by line) basis. They
maintain a symbol table which allows the programmer to
quickly locate all references to a given identifier. It also
allows them to flag identifiers which are never assigned
values. They do some primitive global syntax checking (e.g.
verifying that there are an equal number of BEGIN and END
symbols).


Hansen has implemented a system similar to ours which
he calls EMILY [Hansen 71a, 71b]. EMILY is more general and
can handle any type of hierarchically related text; we
restrict ourselves to the domain of a specific programming
language. However, this restriction allows us to add several
novel and useful features, such as the semantic checker.


Robinson and Parnas describe a set of functions for
storing and manipulating a program in parse tree form
[Robinson 73]. Their mechanism has been used to implement a
syntax driven text editor.


1.2 Brief Description of Editor

The user of our editor does not enter his program
through a keyboard. Instead, he builds the parse tree of his
program, top down, by selecting productions of the grammar
with a light pen.


There is a production associated with every
non-terminal symbol in the language. Each production may
have one or more alternatives associated with it. Each
alternative is a string of zero or more symbols which may be
either terminal or non-terminal.

The user begins with his parse tree initialized to the
goal symbol (a non-terminal). He expands his parse tree by
selecting a non-terminal symbol with the light pen. If the
corresponding production has only one alternative associated
with it, the symbols of that alternative are immediately
inserted in the parse tree below the selected non-terminal.
If the production has two or more alternatives associated
with it, the user is asked to select the desired alternative
with the light pen. The symbols of the selected alternative
are similarly inserted in the parse tree. In both cases the
display is updated to show the new text in place of the
selected non-terminal. As an example, consider the following
grammar:

    <LIST> ::= I
             | <LIST> , I

The display initially shows the goal symbol:

    <LIST>                          +----------+
                                    |  <LIST>  |
                                    +----------+
    Parse Tree                        Display

The user selects the second alternative of <LIST>:

    <LIST>
      |
      V                             +--------------+
    <LIST> ---> ',' ---> 'I'        |  <LIST> , I  |
                                    +--------------+
    Parse Tree                        Display
The user now selects the first alternative:

    <LIST>
      |
      V
    <LIST> ---> ',' ---> 'I'
      |
      V
     'I'

    Parse Tree

The parse tree is now complete since no non-terminals
remain.
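The top-down expansion walked through above can be sketched in a few lines of Python. This is only an illustrative sketch, not the PL/I implementation described in this thesis; the names GRAMMAR, Node, expand, frontier and is_complete are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical encoding of the example grammar:
#   <LIST> ::= I | <LIST> , I
GRAMMAR = {
    "<LIST>": [["I"], ["<LIST>", ",", "I"]],
}

@dataclass
class Node:
    symbol: str
    children: Optional[List["Node"]] = None  # None marks an unexpanded leaf

def is_nonterminal(symbol: str) -> bool:
    return symbol in GRAMMAR

def expand(node: Node, choice: int = 0) -> None:
    """Insert the symbols of the chosen alternative below a non-terminal leaf."""
    if not is_nonterminal(node.symbol) or node.children is not None:
        raise ValueError("only unexpanded non-terminals can be expanded")
    node.children = [Node(s) for s in GRAMMAR[node.symbol][choice]]

def frontier(node: Node) -> List[str]:
    """Symbols that would appear on the display, left to right."""
    if node.children is None:
        return [node.symbol]
    shown: List[str] = []
    for child in node.children:
        shown.extend(frontier(child))
    return shown

def is_complete(node: Node) -> bool:
    """The tree is complete when no non-terminals remain on the frontier."""
    return not any(is_nonterminal(s) for s in frontier(node))

root = Node("<LIST>")            # parse tree initialized to the goal symbol
expand(root, choice=1)           # second alternative: <LIST> , I
expand(root.children[0], 0)      # first alternative: I
```

Because expand only ever inserts an alternative taken from GRAMMAR, every tree that can be built this way corresponds to a syntactically valid sentential form.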

1.2.1 Tree Manipulation Functions

The editor provides the following basic tree
manipulation functions to the user:

1) Insertion - A non-terminal symbol is expanded, as in
the above example.

2) Deletion - The descendant subtree of a non-terminal
symbol is deleted.

3) Copy - The descendant subtree of a non-terminal is
duplicated below a second non-terminal of the same
type. The copied text may be optionally saved in a work
area.

1.2.2 Display Manipulation Functions

The following display manipulation functions are also
provided:

1) Contraction - The terminal and non-terminal symbols
of any production, and those of all its descendants,
are deleted from the display and are replaced by the
non-terminal symbol that is the ancestor of the
production. This is the symbol that appears to the left
of the '::=' meta-symbol in BNF notation.

2) Expansion - A non-terminal symbol is replaced by the
terminal and non-terminal symbols of its descendant
production. This production must exist in the parse
tree, that is to say it must have been previously
contracted.

3) Page Flipping - The display is divided into pages of
40 lines each. The user may flip through the pages of
his listing, either backwards or forwards. Due to the
short amount of time required to generate a page of the
display (about 0.5 seconds), hard copy listings are not
needed.


It should be noted that while the tree manipulation
functions change the display, the display manipulation
functions never modify the parse tree.
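One way to picture this separation is to keep a display-only flag on each tree node. The following Python sketch is an assumption made for illustration (the thesis does not say the editor uses such a flag); Node, render, contract and expand_display are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    symbol: str
    children: List["Node"] = field(default_factory=list)
    contracted: bool = False   # display-only flag; the subtree stays in place

def render(node: Node) -> List[str]:
    """Return the symbols visible on the display."""
    if node.contracted or not node.children:
        return [node.symbol]
    shown: List[str] = []
    for child in node.children:
        shown.extend(render(child))
    return shown

def contract(node: Node) -> None:
    node.contracted = True     # hides the descendants, deletes nothing

def expand_display(node: Node) -> None:
    node.contracted = False    # shows the existing production again

inner = Node("<LIST>", [Node("I")])
root = Node("<LIST>", [inner, Node(","), Node("I")])

before = render(root)
contract(inner)
during = render(root)
expand_display(inner)
after = render(root)
```

Contracting and then expanding returns the display to its earlier state, and the children of the contracted node are untouched throughout.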
1.2.3 Other Features

In addition to the functions described above, our
editor also provides the following components:

1) Paragrapher - special terminal symbols may be
included in the grammar to control paragraphing of the
display.

2) Semantic Checker - a limited amount of semantic
checking may be performed on the parse tree, or on any
subtree. In our current implementation we only check
for undeclared identifiers, and for some uses of
identifiers which conflict with their declared type.


1.3 Advantages of this Method

1.3.1 Elimination of Syntax Errors

Since the programmer expands his parse tree by choosing
from only syntactically valid productions, it is impossible
to construct a program which contains syntax errors. Since
all keywords and punctuation are inserted automatically,
problems such as unbalanced parentheses become non-existent.
1.3.2 Program Viewing in Two Dimensions

The page flipping facility allows the user to scan his
program in the normal way, from front to back. The
expand/contract functions allow him to ascend or descend the
parse tree to any level. For example, the body of a local
procedure may be contracted to a single non-terminal. This
serves to unclutter the display, and makes the hierarchical
structure of the user's program apparent.

1.3.3 Semantic Check

While most programming languages define an identifier
as something like:

    <identifier> ::= <letter>
                   | <identifier> <letter>

    <letter> ::= a | b | ... | z

it would be unacceptably tedious to expand the parse tree
for each identifier a character at a time. We therefore
permit the entry of identifier names via the terminal's
alphanumeric keyboard.

While entering an identifier name, it is possible to
make a variety of errors that cannot be detected with a
context-free grammar alone. We therefore maintain a
block-structured symbol table and provide a function which
traverses the parse tree and checks for the following errors
(in the context of the SUE System Language):

1) undeclared identifiers

2) duplicate identifiers

3) misuse of type (e.g. use of a type as a numeric
quantity)
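The three checks can be illustrated with a small block-structured symbol table. This Python sketch is not the editor's checker; the class SymbolTable, its method names, the error strings, and the assumption that an inner block may redeclare a name from an outer block are all introduced here for illustration.

```python
from typing import Dict, List, Optional

class SymbolTable:
    """Block-structured table: one dictionary per open block."""

    def __init__(self) -> None:
        self.scopes: List[Dict[str, str]] = [{}]
        self.errors: List[str] = []

    def enter_block(self) -> None:
        self.scopes.append({})

    def leave_block(self) -> None:
        self.scopes.pop()

    def declare(self, name: str, kind: str) -> None:
        # A second declaration in the same block is a duplicate.
        if name in self.scopes[-1]:
            self.errors.append(f"duplicate identifier: {name}")
        else:
            self.scopes[-1][name] = kind

    def lookup(self, name: str) -> Optional[str]:
        # Search from the innermost block outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

    def use_as_numeric(self, name: str) -> None:
        kind = self.lookup(name)
        if kind is None:
            self.errors.append(f"undeclared identifier: {name}")
        elif kind == "type":
            self.errors.append(f"type used as numeric quantity: {name}")

table = SymbolTable()
table.declare("color", "type")
table.declare("count", "variable")
table.enter_block()
table.declare("count", "variable")   # assumed legal: shadows the outer name
table.declare("count", "variable")   # duplicate within the same block
table.use_as_numeric("color")        # misuse of a type
table.use_as_numeric("total")        # never declared
table.leave_block()
```

After this run, table.errors holds one message for each of the three error classes listed above.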

1.4 The Project SUE System Language

Project SUE is a project undertaken by the Computer
Systems Research Group at the University of Toronto [Atwood
et al. 71]. Its goal is the design and implementation of an
efficient and reliable operating system for the IBM System
360/370 family of computers.

Early in the project, it was decided that there was no
programming language in existence suitable for the
implementation of the operating system. Therefore, a
considerable amount of time was spent designing a high level
language which met the needs of the project members. The
language has since been implemented for both the 360/370
[Clark 71a, 71b] and the PDP-11 [Kalmar 73]. It has been
successfully used in the implementation of several CSRG
projects.

The language has several features (e.g. macros) which
exerted a considerable influence on the design of our text
editor. We will therefore provide the reader with a brief
description of the language. For a more complete description
the reader should consult the above mentioned references
and the BNF grammar in appendix I.

1.4.1 Language Design Goals

As stated by Clark [Clark 71a], the design goals of the
SUE System Language were based on the following premises:

1) The language must facilitate nicely structured
programs and data.

2) The language must be readable.

3) The language must assist in the prevention and
detection of bugs and logical errors.

4) The language must be compilable into efficient
machine code.

5) The language must give the programmer complete
control over both the emitted code and the allocation
of storage and registers, when he wishes it.

6) The language must be easily modifiable.

7) It must be possible to implement an efficient
compiler for the language in a short amount of time
(six man-months).
1.4.2 Language Description

A program written in the SUE language consists of a
number of hierarchically related, but separately compiled
blocks. There are three types of blocks in the language:
CONTEXT, DATA and PROGRAM. A context block defines the
constant and type declarations to be shared by a group of
procedures. A data block defines data to be shared by a
group of separately compiled, but hierarchically subordinate
(local) procedures. A program block contains the executable
statements of a procedure. It also defines data which is
known solely by that procedure.

1.4.2.1 Data Structures

The data structures in the language are largely based
on the language PASCAL [Wirth 71]. Numeric variables are
defined either by the range of values they may contain, for
example:

    TYPE card_column = (1 TO 80)

or as a fixed length bit string. The user may define his own
non-numeric types by giving a list of the constants
belonging to each type, for example:

    TYPE color = (red,blue,green,violet,yellow)

The compiler disallows conversions between unrelated types.
Optionally, code may be emitted to check that sub-ranges are
not exceeded at run-time.

Indirect addressing is provided by defining

    POINTER TO <type>

as another type. This binding of pointer variables to types
allows the compiler to determine from context when a pointer
should be dereferenced (followed), and still allows full
type checking. The programmer may explicitly control
dereferencing by means of the £ operator.

Data structuring is done using the ARRAY, RECORD, GROUP
and POWERSET types. A record is a list of ordered and named
types. The fields of a record variable occupy contiguous
locations in storage and are aligned on hardware dependent
boundaries. A group is similar to a record. However, its
fields are packed on adjacent bits, and it may be referenced
as a single numeric quantity. It is used to describe objects
such as 360 channel command words, which may be treated
either as a collection of fields or as a single 64 bit
value, for example:

    TYPE ccw =
        GROUP BIT(64);
            BIT(8) (code);
            POINTER (data_address);
            ARRAY (BIT(6)) OF BIT(1) (flags);
            BIT(2) (zeros);
            BIT(8) (unused);
            BIT(16) (count)
        END

Powersets and arrays, as is the case with pointers, must be
bound to a specific type, for example:

    POWERSET OF color

or

    ARRAY ((0 TO 15)) OF BIT(32)

A powerset has as its possible values the set of all
possible subsets of the bound type.
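To make the size of this value space concrete, the following Python sketch enumerates every value that a POWERSET OF color variable could hold, using the five constants of the color type defined above. Python is used here only for illustration; the helper name powerset is an assumption and is not part of SUE.

```python
from itertools import combinations

colors = ["red", "blue", "green", "violet", "yellow"]

def powerset(items):
    """Return every subset of items, from the empty set to the full set."""
    return [frozenset(subset)
            for size in range(len(items) + 1)
            for subset in combinations(items, size)]

values = powerset(colors)
```

A type with n constants yields 2^n subsets, which is why a powerset over a small type can be represented with one bit per constant.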

1.4.2.2 Control Structures

The language contains a rich variety of control
structures, consistent with the design goals listed in
section 1.4.1.

The BEGIN ... END compound is used to define the scope
of names in a nested fashion. Its body is a list of
executable statements, optionally preceded by a list of
local declarations.

The DO ... END compound provides bounded repetition of
the enclosed statement list. Its control variable may be a
programmer defined type, for example:

    DO c:=red TO green;
        ...
    END

The CYCLE ... END compound produces an unbounded
repetition of the enclosed statement list. The loop may be
controlled by the EXIT, RETURN or REPEAT statements.
The EXITS WITH construct permits a compound to return a
value.


The basic selection mechanism is the CASE construct
which selects one element of an array of executable
statement lists, based on the value of an expression. The
expression may be of either a numeric or programmer defined
type, for example:

    CASE color TAG c;
        blue: <statement list>;
        green: <statement list>;
        ELSE: <statement list>

Case alternatives may also be labelled with numeric or
programmer defined subranges, for example: (0 TO 15) or (red
TO green). ELSE is used to refer to all remaining constants
of the type. IF is permitted as an abbreviation for CASE (0
TO 1) TAG, and THEN as a synonym for 1.


The macro facility allows parametric text substitution.
For example:

    MACRO abs(i);
        EXITS WITH BIT(32);
        BEGIN
            IF i<0;
                THEN: EXIT WITH -i;
                ELSE: EXIT WITH i
            END
        END
    END MACRO

Macros may invoke other macros, but recursive macro
definitions are disallowed.

The INLINE construct causes the compiler to emit a
specific machine instruction. For improved readability, it
may be used within a macro body. The name of the macro may
then be chosen to give semantic meaning to the hardware
function being performed. For example:

    MACRO load(register,variable);
        INLINE ("58",register,variable)
    END MACRO

or

    MACRO load_absolute(register,base,displacement);
        INLINE ( 11 ^ M ,register,base,displacement)
    END MACRO

In the first example, the second parameter is a BIT(32)
variable. The base register will be automatically provided
by the compiler. In the second example, the programmer
explicitly specifies the base register. The third parameter
must be a constant.
2 Editor Overview

In this chapter we describe how our text editor is used
by the programmer. In particular, we show how the following
operations may be performed:

1) Create a new SUE program and save its parse tree.

2) Retrieve the parse tree of an existing program and
modify it.

3) Define and reference SUE macros.

4) Verify a program with the semantic checker.

2.1 Display

2.1.1 2250 Hardware

The 2250 display provides four communication paths
between the programmer and the software. They are:

1) CRT screen

2) alphanumeric keyboard

3) light pen

4) function keys

The CRT screen normally displays all or part of the SUE
program. When the program's parse tree is fully expanded,
only its leaves (terminal symbols) are displayed. If the
tree is not complete, or if the programmer has ascended the
tree using the contract function, some nodes (non-terminal
symbols) will be displayed. The screen is also used to
display messages to the programmer when some action on his
part is required.

The keyboard is used for the entry of identifier names
and expressions. The programmer may insert an expression
either by expanding its parse tree a node at a time, or by
typing it directly into the keyboard. In the latter case,
the editor will automatically compute the expression's parse
tree, and notify the programmer of any syntax errors.

The light pen is used by most editor functions to
select a parse tree node or leaf. It is also used to select
production alternatives.


The 2250 has a second keyboard containing 32 function
keys. Each key has a lamp within it that may be switched on
or off to indicate whether or not the key is active.
Depressing a dark key has no effect on the software. Of the
32 available keys, only 14 are used by the editor. The keys
have the following names:

1) EXPAND

2) CONTRACT

3) COPY

4) DELETE

5) CHECK

6) QUIT

7) NULL ON

8) NULL OFF

9) MACRO PARAMETER

10) MACRO NAME

11) NEW

12) OLD

13) FLIP FORWARD

14) FLIP BACKWARD

The editor has 5 basic modes of operation: EXPAND,
CONTRACT, COPY, DELETE and CHECK modes. Each mode is entered
by depressing the corresponding function key. Each mode
requires the programmer to select one or more symbols on the
screen with his light pen. At this time the programmer may:

1) Select a symbol as requested.

2) Change modes (keys 1 - 5).

3) Terminate the editor with the QUIT key (6). The
parse tree will be saved as an O/S file.

4) Display the next or previous page (keys 13 and 14).

5) Turn the display of null tree leaves on or off (keys
7 and 8).

Several productions in the SUE grammar have null right
hand sides. Since the light pen only responds to light areas
of the 2250 screen, it is not possible to select a null tree
leaf. If it becomes necessary to do this (e.g. to delete
it), the NULL ON key may be depressed. This causes all null
leaves to be displayed as the symbol »

When the editor is first started, the message

    NEW OR OLD?

is displayed on the screen. Depressing the NEW function key
causes the parse tree to be initialized to the goal symbol
of the SUE grammar. Depressing the OLD key causes the parse
tree of a previously saved SUE program to be read from an
O/S file.

2.2 Expansion

Depressing the EXPAND key causes the message

    SELECT NON-TERMINAL TO BE EXPANDED

to be displayed. The programmer now selects a non-terminal
with his light pen. The non-terminals can be divided into
two groups: those which have descendant subtrees (nodes),
and those which do not (leaves). The non-terminal leaves
appear underscored on the display. If the programmer selects
a node, it is replaced (on the display screen) by the
terminal and non-terminal symbols of its descendant
production. If the programmer selects a leaf, and the
production associated with that non-terminal has only one
alternative, replacement can be made immediately. If the
production has two or more alternatives, however, they are
displayed on the screen, along with the message

    SELECT ALTERNATIVE

The programmer then points his light pen at the desired
alternative. In any case, the non-terminal originally
selected is removed from the screen, and the terminal and
non-terminal symbols of the production alternative appear in
its place.


2.3 Contraction

Depressing the CONTRACT key causes the message

    SELECT SYMBOL TO BE CONTRACTED

to be displayed. The programmer now selects any symbol
(terminal or non-terminal) with his light pen. This symbol,
all symbols in the production forming it, and all symbols in
all descendant subtrees are deleted from the screen. They
are replaced by the non-terminal symbol immediately above in
the parse tree. This is the symbol that appears on the left
hand side of the production.


2.4  Deletion 

Arbitrary text cannot be deleted from the SUE program's
parse tree; only a proper subtree may be deleted. The CONTRACT
function must first be used to replace the subtree by its
head non-terminal node. Recall that this replacement is done
on the screen only; the parse tree remains unchanged. We
require the redundant pre-contract operation in the hope of
preventing the programmer from inadvertently deleting too
large a subtree. Next, the DELETE key is depressed. The
message

    SELECT NON-TERMINAL TO BE DELETED

is displayed. The previously contracted non-terminal (or any
other node) is then selected by the programmer with his
light pen. The entire subtree is then deleted. Since the
selected node has now become a leaf, it is re-displayed with
underscores below it.


2.5  Copying 

A proper subtree may be copied to either a non-terminal
leaf of the proper syntactic type or to one of several
temporary storage areas. The CONTRACT function must first be
used to reduce the source subtree to a single non-terminal
node on the display. Next, the COPY key is depressed. The
message

    SELECT SOURCE NON-TERMINAL

is displayed. If a subtree has been previously copied to the
first temporary storage area, the message

    TEMPORARY STORAGE AREA 1 = < >

is displayed, with the syntactic type of the subtree between
the meta-brackets (< >). Otherwise the message

    TEMPORARY STORAGE AREA 1 = <EMPTY>

is displayed. A similar message is displayed for the
remaining temporary storage areas. The programmer now
selects either a non-terminal node or the temporary storage
area with his light pen. Next, the message

    SELECT DESTINATION NON-TERMINAL

is displayed. The programmer selects either a non-terminal
leaf or the temporary storage area with his light pen. The
entire subtree descended from the source non-terminal is
duplicated below the destination. If the destination is the
temporary storage area, any subtree previously saved there
will be lost. The source and destination must always be of
the same syntactic type, or the copy operation will not be
performed. Since the destination leaf has now become a node,
the underscores beneath it are removed from the display.
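The copy rules above (the source must be a node, the destination a leaf, both of the same syntactic type) can be sketched as follows. This Python fragment is illustrative only; Node, copy_subtree, save_to_area and temp_areas are hypothetical names, and the single storage area shown stands in for the several areas the editor provides.

```python
import copy
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Node:
    symbol: str                               # syntactic type, e.g. "<LIST>"
    children: Optional[List["Node"]] = None   # None marks a non-terminal leaf

# Hypothetical temporary storage; None plays the role of <EMPTY>.
temp_areas: Dict[int, Optional[Node]] = {1: None}

def copy_subtree(source: Node, dest: Node) -> bool:
    """Duplicate source's subtree below dest; refuse on a type mismatch."""
    if source.children is None or dest.children is not None:
        return False          # source must be a node, destination a leaf
    if source.symbol != dest.symbol:
        return False          # syntactic types must agree
    dest.children = copy.deepcopy(source.children)
    return True

def save_to_area(source: Node, area: int) -> None:
    """Save a copy in a storage area, replacing whatever was there."""
    temp_areas[area] = copy.deepcopy(source)

src = Node("<LIST>", [Node("I")])
dst = Node("<LIST>")
other = Node("<statement>")

ok = copy_subtree(src, dst)
rejected = copy_subtree(src, other)
save_to_area(src, 1)
```

A deep copy is used so that later edits to the destination subtree cannot alter the source.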

2.6  Macros 

Macros must be treated as a special case, since the
arbitrary placement of a macro reference cannot be expressed
by a context free grammar. In order to allow us to store
macro references directly in the parse tree, we have imposed
the following minor restrictions on the SUE language:

1) A macro body must have a corresponding parse tree
with a single head node.

2) The text string to be substituted for a macro
parameter must have a corresponding parse tree with a
single head node.

For example

    MACRO interchange(x,y);
        BEGIN
            DECLARE
                BIT(32) (temp);
            temp:=x;
            x:=y;
            y:=temp
        END
    END MACRO

is a valid macro with a head node of <compound> (see
appendix I). However,

    MACRO illegal;
        END;
        BEGIN
    END MACRO

violates the above restriction. It is not possible to form
such a construct with our editor. We feel this restriction
is a small price to pay for the guarantee of syntax error
free programs. This restriction also enforces good
programming style.


Macro definitions are created using the EXPAND
function. When the programmer attempts to expand the
non-terminal <macro body>, he is asked to choose the
syntactic type of the macro body's parse tree. The
programmer recursively expands the production <macro
parameter list> to specify the number of macro parameters,
and to assign names to them.
When the programmer wishes to insert a parameter in the
macro body, he depresses the MACRO PARAMETER key immediately
before selecting a non-terminal leaf with his light pen. The
message

    ENTER MACRO PARAMETER

is displayed. The name of the desired parameter is entered
using the keyboard. This operation implicitly assigns a
syntactic type to the parameter, and all subsequent uses of
the parameter will be type checked.

To reference a macro, its definition must be at least
partially complete. All parameters (if any) must be
implicitly typed, and its body must have been assigned a
type. The programmer depresses the MACRO NAME key immediately
before selecting the non-terminal tree leaf to be replaced
by the macro reference. The message

    ENTER MACRO NAME

is then displayed. The programmer enters the macro name
using the keyboard. The non-terminal is then replaced by the
macro name and parameter list. For example, if the
non-terminal <compound> were replaced by a reference to the
macro "interchange", the display would be updated to read:

    INTERCHANGE (<STORAGE REFERENCE>, <STORAGE REFERENCE>)

The two parameters may then be expanded in the normal way.
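The implicit typing of macro parameters can be sketched as a small bookkeeping class. The Python below is an illustration under stated assumptions, not the editor's macro processor: MacroDefinition, use_parameter and referenceable are hypothetical names, and the non-terminal <expression> is assumed here only to demonstrate a mismatch.

```python
from typing import Dict, List, Optional

class MacroDefinition:
    def __init__(self, name: str, parameters: List[str],
                 body_type: Optional[str] = None) -> None:
        self.name = name
        self.body_type = body_type   # syntactic type chosen for the body
        # None means the parameter has not yet been used in the body.
        self.param_types: Dict[str, Optional[str]] = {
            p: None for p in parameters
        }

    def use_parameter(self, param: str, leaf_type: str) -> bool:
        """Place param at a non-terminal leaf of syntactic type leaf_type."""
        if param not in self.param_types:
            return False                          # not a parameter of this macro
        known = self.param_types[param]
        if known is None:
            self.param_types[param] = leaf_type   # first use fixes the type
            return True
        return known == leaf_type                 # later uses are type checked

    def referenceable(self) -> bool:
        """True once the body and every parameter have been assigned a type."""
        return (self.body_type is not None
                and all(t is not None for t in self.param_types.values()))

m = MacroDefinition("interchange", ["x", "y"], body_type="<compound>")
first_use = m.use_parameter("x", "<storage reference>")
mismatch = m.use_parameter("x", "<expression>")
ready_before = m.referenceable()
m.use_parameter("y", "<storage reference>")
ready_after = m.referenceable()
```

Once every parameter carries a type, a reference to the macro can be displayed with one typed non-terminal per parameter, as in the INTERCHANGE example above.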




2.7  Semantic  Checker 


The programmer invokes the checker by depressing the
CHECK key. The editor responds with the message

    SELECT NON-TERMINAL

The programmer then selects a non-terminal node. The entire
descendant subtree of that node is then checked for semantic
errors. To check the entire program, it must be reduced to
its goal symbol using the CONTRACT function. Any semantic
errors detected are displayed. A complete description of the
types of errors detected is given in section 3.5.


3  Internal  Structure 

This chapter describes the internal structure of our
editor. The data structures used to represent the grammar,
the programmer's parse tree, and the symbol table are
described in detail. An informal description of the
algorithms used to manipulate these data structures follows.
Finally, a brief description of the semantic checker is
given.

The editor itself is a fairly conventional PL/I
program. When compiled by the OS PL/I (F) compiler [IBM 73],
it runs in a 150K byte partition under OS/MFT. This compares
favorably with Hansen's EMILY, which requires 60K bytes of
core, plus 400K bytes of Large Capacity Storage.

3.1 Representation of Grammar

We have written a separate program which reads the SUE
grammar from cards in a format similar to BNF and creates
the data structures described below. This makes it a
relatively simple task to modify the editor for use with
another programming language. The only parts of the editor
which are SUE dependent are the semantic checker, the macro
processor, and the symbol table management routines.
The SUE grammar, as shown in Appendix I, has been
extended to show the nesting of Data and Program blocks.

There are three major data structures used to describe
the SUE language grammar. They are the grammar symbol table,
the symbol descriptors and the production descriptors.

3.1.1  Grammar  Symbol  Table 


The grammar symbol table is shown in figure 3.1. It is
an array containing one entry for each terminal (Ti) and
non-terminal (Ni) symbol in the grammar. Each entry is a
pointer to a symbol descriptor. Whenever the parse tree
references a symbol in the grammar, it is always done by
means of a numeric index into the grammar symbol table,
instead of using a pointer to the symbol descriptor
directly. This allows the symbol descriptors to be assigned
different memory addresses on subsequent runs of the editor.


+----+----+-----+----+----+----+-----+----+
| T1 | T2 | ... | Tn | N1 | N2 | ... | Nm |
+----+----+-----+----+----+----+-----+----+

figure 3.1


3.1.2 Symbol Descriptors

There is one symbol descriptor (figure 3.2) for each
terminal and non-terminal symbol in the grammar. A symbol
descriptor contains the text string of the symbol it
represents. In addition, the symbol descriptor for a
non-terminal symbol contains a pointer to a production
descriptor.


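+----------------------------------+----------------+
| pointer to production descriptor | text of symbol |
| (if non-terminal) or null        |                |
+----------------------------------+----------------+

figure 3.2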

3.1.3 Production Descriptors

There is one production descriptor (figure 3.3) for
each alternative of every production. A production
descriptor contains:

1) the text string of the right hand side of the
production alternative. The text is only used for
display purposes; it is never scanned by the editor.

2) a pointer to the production descriptor of the next
alternative of the production. This field is null for
the last production descriptor in the list.

3) a pointer to a linked list of grammar symbol
numbers. These numbers are mapped to the actual symbols
they represent by means of the grammar symbol table.

Thus, the production descriptors for each production form a
linked list of linked lists.


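+-----------------------+-------------------+-------------+
| pointer to next       | pointer to linked |             |
| production descriptor | list of grammar   | text string |
| (or null)             | symbol numbers    |             |
+-----------------------+-------------------+-------------+

figure 3.3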
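
As a rough illustration of the three structures described in
this section, the following Python sketch builds them for the
small <LIST> grammar. The class and function names are
hypothetical, a Python list stands in for the linked list of
grammar symbol numbers, and symbols are numbered in order of
first appearance rather than terminals first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProductionDescriptor:
    text: str                 # right hand side, used only for display
    symbols: list             # grammar symbol numbers of the right hand side
    next: Optional["ProductionDescriptor"] = None   # next alternative, or None


@dataclass
class SymbolDescriptor:
    text: str
    production: Optional[ProductionDescriptor] = None   # None for terminals


def build_grammar(rules):
    """rules maps a non-terminal to its alternatives (lists of symbol texts)."""
    table = []    # grammar symbol table: symbol number -> symbol descriptor
    index = {}    # symbol text -> symbol number

    def number_of(text):
        if text not in index:
            index[text] = len(table)
            table.append(SymbolDescriptor(text))
        return index[text]

    for lhs, alts in rules.items():
        head = None
        # Build back to front so that each descriptor links to the next one.
        for rhs in reversed(alts):
            numbers = [number_of(s) for s in rhs]
            head = ProductionDescriptor(" ".join(rhs), numbers, head)
        table[number_of(lhs)].production = head
    return table, index


def alternatives(table, symbol_number):
    """Return the production descriptors of a non-terminal, in order."""
    found, p = [], table[symbol_number].production
    while p is not None:
        found.append(p)
        p = p.next
    return found


table, index = build_grammar({"<LIST>": [["T"], ["<LIST>", ",", "T"]]})
```

Because the parse tree would hold only symbol numbers, the
descriptors can be rebuilt at different addresses on each run
without invalidating a stored tree.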


3.1.4 Example

As an example, let us show how the simple grammar
defined in section 1.2

<LIST> ::= T
         | <LIST> , T

would be represented:

[Figure: the grammar symbol table, the symbol descriptors
for 'T' and '<LIST>', the production descriptors for 'T' and
'<LIST> , T', and the linked lists of grammar symbol numbers
for this grammar.]


3.2 Representation of Parse Tree

Figure 3.4 shows the contents of a parse tree node/leaf.
Since we want to permit the editing of arbitrarily large
trees, the entire tree is not held in core storage. Storage
for the parse tree is allocated in fixed length pages. A
parse tree node or leaf is referenced by a "virtual
address", consisting of a page number and an offset within
that page. Currently, up to five pages, each 1024 bytes
long, are kept in core. The remaining pages are kept on an
O/S direct access file. A "least recently used" algorithm
controls page swapping, as suggested by Knuth [Knuth 73].
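
The paging scheme can be sketched as follows. This is only an
illustration under stated assumptions: the class name
PagedStore is hypothetical, a dictionary stands in for the
direct access file, and a virtual address is modelled as a
(page number, offset) pair.

```python
from collections import OrderedDict

PAGE_SIZE = 1024      # bytes per page, as stated in the text
PAGES_IN_CORE = 5     # pages kept in core, as stated in the text


class PagedStore:
    def __init__(self):
        self.core = OrderedDict()   # page number -> bytearray, oldest first
        self.disk = {}              # stand-in for the direct access file
        self.swaps = 0              # number of pages written out so far

    def _page(self, page_no):
        if page_no in self.core:
            self.core.move_to_end(page_no)       # now the most recently used
        else:
            if len(self.core) >= PAGES_IN_CORE:
                # Evict the least recently used page to the file.
                victim, data = self.core.popitem(last=False)
                self.disk[victim] = data
                self.swaps += 1
            data = self.disk.pop(page_no, None)
            self.core[page_no] = data if data is not None else bytearray(PAGE_SIZE)
        return self.core[page_no]

    def read(self, vaddr):
        page_no, offset = vaddr
        return self._page(page_no)[offset]

    def write(self, vaddr, value):
        page_no, offset = vaddr
        self._page(page_no)[offset] = value
```

Touching a sixth page forces the page that has gone unused
longest out to the file; touching it again brings it back at
the expense of the next oldest page.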

The parse tree is implemented as a "threaded" binary
tree [Knuth 68]. Those nodes and leaves of the tree to be
displayed on the 2250 are linked together with the display
links shown in figure 3.4. All parse tree nodes and leaves
contain a pointer to the current level block table head (see
section 3.4). This permits the scope of identifiers to be
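quickly determined.

+---------------------------------------------+
| symbol number                               |
+---------------------------------------------+
| terminal/non-terminal flag                  |
+---------------------------------------------+
| threaded link flag                          |
+----------------------+----------------------+
| left link            | right (thread) link  |
+----------------------+----------------------+
| back display link    | forward display link |
+----------------------+----------------------+
| block table head pointer                    |
+---------------------------------------------+

figure 3.4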


We have added several special terminal symbols to the
SUE grammar. They are used to control paragraphing, symbol
table management, and macro definitions. They appear in the
parse tree, but are not displayed. The following is a list
of these symbols and their functions.

$in - indent
$ex - exdent
$sp - space
$nl - start new line

$sl - start scope block
$el - end scope block

$tt - type definition
$tc - constant definition
$tv - variable definition
$te - exit label definition
$tm - macro name definition
$tq - macro parameter definition

$gi - accept identifier from keyboard
$cn - accept number from keyboard
$cs - accept character string from keyboard
$lt - accept type identifier from keyboard
$lc - accept constant identifier from keyboard
$le - accept exit identifier from keyboard
$lv - accept variable identifier from keyboard
$mb - get type for macro body
$kb - accept <expression> from keyboard and parse it


3.3  Manipulation  of  Parse  Tree 

We will now give a brief description of the algorithms
used to implement the editor functions described in chapter
two.

3.3.1 Tree Node/Leaf Identification

It is necessary to be able to associate a light pen
"hit" on the 2250 screen with an actual node or leaf in the
parse tree. When the editor wishes to display a symbol on
the 2250, it first moves the text to a core image of the
terminal's buffer. The buffer address and the virtual
address (page number, offset) of the parse tree node or leaf
corresponding to the symbol are saved in a table. The buffer
image is then transferred to the 2250. When the programmer
points the light pen at the screen, the 2250 hardware
records the buffer address of the character selected. By
means of the aforementioned table, the editor can retrieve
the parse tree address.
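
A minimal sketch of such a table is given below. It assumes
one buffer position per character and that symbols are
recorded in increasing buffer order; the class name HitTable
and the use of a binary search are illustrative choices, not
a description of the actual implementation.

```python
import bisect


class HitTable:
    """Maps display buffer addresses to parse tree virtual addresses."""

    def __init__(self):
        self.starts = []     # buffer address of each symbol's first character
        self.entries = []    # (end address, virtual address) for each symbol

    def record(self, buffer_addr, text, vaddr):
        # Called as each symbol is moved into the buffer image.
        self.starts.append(buffer_addr)
        self.entries.append((buffer_addr + len(text), vaddr))

    def lookup(self, hit_addr):
        # Find the last symbol starting at or before the hit address.
        i = bisect.bisect_right(self.starts, hit_addr) - 1
        if i >= 0 and hit_addr < self.entries[i][0]:
            return self.entries[i][1]
        return None          # the hit fell between two symbols
```

A hit anywhere inside the text of a symbol yields the same
(page number, offset) pair, so the programmer need not point
at the first character.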


3.3.2

The parse tree is displayed by following the forward
display links. The grammar symbol table and the symbol
descriptors are used to convert the symbol numbers in the
tree to character strings. The display procedure also
interprets the paragraphing symbols ($nl, $in, $ex and $sp).

Page flipping is done by following either the forward
or backward display links until the required number of lines
have been passed (there are currently 40 lines per page). A
new line is started by either a $nl symbol or by a string of
symbols that exceeds the line-length of the screen.
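
The interpretation of the paragraphing symbols might look
roughly like the sketch below. The function name layout, the
line length of 74 and the indentation step of 3 are
assumptions made for the example; the sketch also drops
empty lines, which the editor may or may not do.

```python
def layout(symbols, line_length=74, indent_step=3):
    """Turn a sequence of symbol texts into a list of display lines."""
    lines, current, indent = [], "", 0

    def flush():
        nonlocal current
        if current.strip():
            lines.append(current.rstrip())
        current = ""

    for sym in symbols:
        if sym == "$in":
            indent += indent_step            # takes effect on the next line
        elif sym == "$ex":
            indent = max(0, indent - indent_step)
        elif sym == "$nl":
            flush()
        elif sym == "$sp":
            if current:
                current += " "
        elif sym.startswith("$"):
            continue                         # other special symbols are not shown
        else:
            if current and len(current) + len(sym) > line_length:
                flush()                      # the symbol would overflow the line
            if not current:
                current = " " * indent
            current += sym
    flush()
    return lines
```

For example, the symbols BEGIN $in $nl X $sp := $sp 1 $ex $nl
END are laid out on three lines, with the middle line
indented by one step.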


3.3.3 Expansion

If the left link (see fig. 3.4) of the selected symbol
is not null, it is a tree node. It is only necessary to
alter the display links on the level below the node (pointed
at by the left link) to follow the parse tree links.

If the left link is empty, we must expand the
production associated with this symbol (the symbol must be a
non-terminal). The tree leaf symbol number is used as an
index into the grammar symbol table. This gives the address
of the symbol descriptor, which in turn gives the address of
the production alternative descriptor. If the link field of
the production alternative descriptor is null, there is only
one alternative associated with this production. Otherwise,
the text strings associated with the alternatives are
displayed on the screen, and one is selected by the
programmer. New tree leaves are allocated and initialized by
copying the necessary information from the linked list of
symbol numbers.
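
The expansion step can be sketched as follows, using a
hypothetical miniature grammar table for <LIST>. The names
SYMBOLS, ALTERNATIVES, Node, expand and children_of are
invented for the example, and the choose argument stands in
for the programmer's light pen selection.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tables for the grammar  <LIST> ::= T | <LIST> , T
SYMBOLS = ["T", ",", "<LIST>"]                         # grammar symbol table
ALTERNATIVES = {2: [("T", [0]), ("<LIST> , T", [2, 1, 0])]}


@dataclass(eq=False, repr=False)
class Node:
    symbol: int
    left: Optional["Node"] = None     # first descendant, None for a leaf
    right: Optional["Node"] = None    # next sibling, or thread to the parent
    thread: bool = False              # True when `right` is a thread


def expand(leaf, choose=lambda texts: 0):
    """Expand a non-terminal leaf into the symbols of one alternative."""
    alts = ALTERNATIVES.get(leaf.symbol)
    if alts is None:
        raise ValueError("only non-terminals can be expanded")
    if leaf.left is not None:
        return leaf                   # already a node: only display links change
    # A single alternative needs no menu; otherwise the programmer chooses.
    pick = 0 if len(alts) == 1 else choose([text for text, _ in alts])
    leaves = [Node(number) for number in alts[pick][1]]
    for a, b in zip(leaves, leaves[1:]):
        a.right = b
    leaves[-1].right, leaves[-1].thread = leaf, True   # thread back to parent
    leaf.left = leaves[0]
    return leaf


def children_of(node):
    """List the symbol texts directly below `node`."""
    texts, n = [], node.left
    while n is not None:
        texts.append(SYMBOLS[n.symbol])
        n = None if n.thread else n.right
    return texts
```

Expanding a <LIST> leaf and choosing the second alternative
yields three new leaves, the first of which is again an
unexpanded <LIST>.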

3.3.4  Contraction 

The tree is traversed by following the right links
until a "thread" link is found. Following this link brings
us to a node that is up one syntactic level in the tree. The
display links are then altered to skip over the section of
the tree below this node.
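
The search for the enclosing node can be sketched as below.
The Node class and the helper adopt are hypothetical; only
the tree links are modelled, not the display links.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.left = None       # first descendant
        self.right = None      # next sibling, or thread to the parent
        self.thread = False    # True when `right` is a thread


def adopt(parent, children):
    """Link `children` beneath `parent`, threading the last one back up."""
    parent.left = children[0]
    for a, b in zip(children, children[1:]):
        a.right, a.thread = b, False
    children[-1].right, children[-1].thread = parent, True


def parent_of(node):
    """Follow right links until a thread is found; it leads up one level."""
    while not node.thread:
        if node.right is None:
            return None        # the goal symbol has no parent
        node = node.right
    return node.right
```

The thread makes it unnecessary to store an explicit parent
pointer in every node; the cost is a walk along the siblings
to the right of the selected symbol.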

3.3.5  Deletion 

The entire subtree below the selected node is
traversed, and each node or leaf is returned to the free
list.
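
A sketch of this traversal follows. For brevity it uses
ordinary child lists instead of threaded links, and it
assumes that the selected node itself is kept and reverts to
an unexpanded leaf; both points are assumptions of the
example.

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def delete_subtree(node, free_list):
    """Return every node or leaf below `node` to `free_list`."""
    freed = 0
    stack = list(node.children)
    node.children = []              # the selected node becomes a leaf again
    while stack:
        n = stack.pop()
        stack.extend(n.children)
        n.children = []
        free_list.append(n)         # storage can be reused by later expansions
        freed += 1
    return freed
```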

3.3.6  Copying 

This is done in a straightforward manner by traversing
the source subtree and allocating new nodes which are linked
beneath the destination node.

3.3.7 Expression Parser

When a $kb symbol is found on the right side of a
production, the expression parser is invoked. The programmer
is asked to enter an <expression> from the keyboard, and it
is parsed top down (recursive descent). All identifiers
referenced must have been previously declared, so that the
parser can correctly apply the production

<storage reference> ::= <variable identifier>
                      | <procedure identifier>
                      | ...
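
The following sketch shows recursive descent parsing for a
small arithmetic subset only; it does not cover the SUE
<expression> grammar. The token rules, the nested tuple
result and the name parse_expression are assumptions of the
example. It does illustrate why identifiers must already be
declared when the expression is entered.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[-+*/()])")


def parse_expression(text, declared):
    """Parse `text` top down; `declared` is the set of known identifiers."""
    tokens = TOKEN.findall(text)
    if "".join(tokens) != "".join(text.split()):
        raise SyntaxError("unexpected character in expression")
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def factor():
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of expression")
        take()
        if tok == "(":
            inner = expression()
            if peek() != ")":
                raise SyntaxError("missing )")
            take()
            return inner
        if tok.isdigit():
            return ("number", tok)
        if tok[0].isalpha() or tok[0] == "_":
            if tok not in declared:
                raise NameError("undeclared identifier " + tok)
            return ("variable", tok)
        raise SyntaxError("unexpected " + tok)

    def term():
        node = factor()
        while peek() in ("*", "/"):
            node = (take(), node, factor())
        return node

    def expression():
        node = term()
        while peek() in ("+", "-"):
            node = (take(), node, term())
        return node

    tree = expression()
    if peek() is not None:
        raise SyntaxError("unexpected " + peek())
    return tree
```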

3.4 Symbol Table Management

In this section we describe the handling of
identifiers, constants and character strings.

3.4.1  Block  Structure 

A scope block is defined in the SUE grammar by a
$sl...$el pair (see Appendix I). When a production
containing these symbols is expanded, a block table
head is created for the new scope block. The block
table head contains a pointer to a linked list of identifier
names (the block symbol table) declared in this block
(initially null), and a pointer to the block table head of
the next outermost (enclosing) block. Block table heads and
symbol tables are allocated in the same way as parse tree
nodes and leaves (i.e. they have virtual addresses).

3.4.2 Identifier Declaration

When the programmer attempts to expand the production

<identifier> ::= $gi

he is asked to enter the identifier name from the keyboard.
The name is added to the block symbol table for the current
(innermost) block. A check is made for duplicate names. A
type (compound, constant, procedure, exit, variable) is
assigned to the identifier depending on the special type
symbol in the ancestor production ($tt, $tc, $tp, $te, $tv).
The left link of the $gi tree leaf points at the entry in
the block symbol table.

3.4.3  Identifier  Reference 

All productions which refer to identifiers (other than
declarations, type or constant definitions) require an
identifier of a particular type (e.g. <constant identifier>
or <procedure identifier>). When the programmer expands a
production which contains any of the symbols $lt, $lc, $lp,
$le or $lv (see Appendix I), he is asked to enter an
identifier name from the keyboard. The block symbol tables
for the current block, and all enclosing blocks, are
searched for the name. If the name is not found, a warning
message is issued and it is inserted in the block symbol
table of the current (innermost) block. The entry is marked
"undeclared" so that a subsequent attempt to declare it will
not be treated as a duplicate declaration. The left link of
the identifier leaf is set to point at the entry in the
block symbol table.
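
The declaration and reference rules of sections 3.4.1 to
3.4.3 can be sketched together. In this illustration a
dictionary stands in for the linked list of identifier
names, and the names BlockTableHead, declare and reference
are hypothetical. Returning a message for a duplicate
declaration is an assumption; the text says only that a
check is made.

```python
class BlockTableHead:
    def __init__(self, enclosing=None):
        self.symbols = {}            # the block symbol table: name -> entry
        self.enclosing = enclosing   # block table head of the enclosing block


def declare(block, name, kind):
    """Add `name` to `block`; return a message if it is a duplicate."""
    entry = block.symbols.get(name)
    if entry is not None and entry["declared"]:
        return "duplicate declaration of " + name
    if entry is None:
        entry = block.symbols[name] = {"refs": 0}
    # An entry marked undeclared by an earlier reference is completed here.
    entry.update(kind=kind, declared=True)
    return None


def reference(block, name, kind):
    """Search the current block and all enclosing blocks for `name`."""
    scope = block
    while scope is not None:
        if name in scope.symbols:
            scope.symbols[name]["refs"] += 1
            return scope.symbols[name], None
        scope = scope.enclosing
    # Not found: insert it in the innermost block, marked undeclared.
    entry = block.symbols[name] = {"kind": kind, "declared": False, "refs": 1}
    return entry, "warning: " + name + " is not declared"
```

Because every tree node carries a pointer to its block table
head, the search can start at the innermost block without
walking up the parse tree first.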

3.4.4  Constants  and  Strings 

When the programmer expands the productions

<number> ::= $cn

or

<string> ::= $cs

he is asked to enter a numeric or string constant from the
keyboard. Space is allocated for the text string entered and
it is pointed at by the left link of the $cn or $cs tree
leaf. Multiple appearances of a constant are stored
separately. Hopefully, the SUE symbolic constant definition
facility will keep multiple constant appearances to a
minimum.

3.4.5 Deletion and Copying

When a subtree is deleted, care must be taken that the
integrity of the block symbol table is maintained. A
reference count is kept in each table entry. If the deleted
subtree contains the only reference(s) to an identifier,
then that identifier must be deleted from the block symbol
table.
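
The reference count bookkeeping might be sketched as below.
The function name release and the dictionary layout are
hypothetical, and the example assumes that each occurrence
in the deleted subtree, including a declaration, counts as
one reference.

```python
def release(block_symbols, deleted_names):
    """Drop one reference per deleted occurrence of each identifier.

    Entries whose count reaches zero are removed from the block symbol
    table; the names removed are returned.
    """
    removed = []
    for name in deleted_names:
        entry = block_symbols[name]
        entry["refs"] -= 1
        if entry["refs"] == 0:
            del block_symbols[name]
            removed.append(name)
    return removed
```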


When a subtree is copied, all identifiers in that
subtree must be re-linked to the proper block symbol table.
It is possible that a copy operation may result in duplicate
and/or undeclared identifiers. If so, a warning message is
issued, but the copy operation is allowed to proceed.


3.5  Semantic  Checker 

A limited amount of semantic checking may be done on
the entire parse tree, or on any subtree. The semantic
checker traverses the entire subtree below the selected node
and performs the following checks:

1) Macro references and macro definitions must agree in
parameter type and number.

2) All identifiers must be declared.

3) All identifiers must be of the proper type (e.g. a
type identifier may not be used in an expression).
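
The three checks can be sketched over a simplified tree of
dictionaries. The node layout (kind, name, expected, args,
children), the table formats and the message wording are all
assumptions made for the example.

```python
def check(node, symbols, macros, errors=None):
    """Walk the subtree at `node` and collect semantic error messages."""
    errors = [] if errors is None else errors
    kind = node.get("kind")
    if kind == "identifier":
        entry = symbols.get(node["name"])
        if entry is None or not entry["declared"]:
            errors.append(node["name"] + " is not declared")               # check 2
        elif entry["type"] != node["expected"]:
            errors.append(node["name"] + " is not a " + node["expected"]
                          + " identifier")                                 # check 3
    elif kind == "macro reference":
        # Parameter types are compared as lists, which covers number as well.
        if macros.get(node["name"]) != node["args"]:
            errors.append("macro " + node["name"]
                          + " does not match its definition")              # check 1
    for child in node.get("children", []):
        check(child, symbols, macros, errors)
    return errors
```

The sketch collects all messages in one pass, whereas the
editor reports each error as it is found and waits for the
programmer before continuing.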

When an error is detected, an appropriate message is
displayed on the screen. The programmer may then point the
light pen anywhere on the screen to continue checking, or
terminate the checking (and hopefully correct the errors) by
selecting another function key.

Obviously, a far more sophisticated checker could be
implemented. Ultimately, the checker could be replaced by an
interactive compiler. Such a compiler would not need a
scanner, parser, symbol table or any syntax error recovery
mechanisms.

4 Conclusion


In the preceding chapters, we have described the use
and internal structure of our editor. A working version is
currently available at the University of Toronto Computer
Research Facility. In this chapter, we will discuss what we
have learned as a result of our work. We will also suggest
some areas for future research.

4.1 Cost Effectiveness

There are two questions which must be answered when
evaluating any programming system such as our editor:

1) Does the system make effective use of the
programmer's time?

2) Can the system be justified in terms of the cost of
the computing resources needed to support it?

The first question is a difficult one, as its answer
depends on the personal preferences of the individual
programmer. Ideally, a detailed investigation should be
conducted, similar to Sackman's time-sharing vs. batch
study [Sackman 70]. Due to financial and time constraints,
our experience with the editor has been limited to a few
simple programs. We have, however, been able to make the
following observations:

1) It takes somewhat longer to enter a program with our
editor than with a conventional interactive text
editor. However, forcing the programmer to carefully
consider the hierarchical structure of his program
seems to help prevent logic errors.

2) Our editor is especially useful with a language like
SUE, which is defined in the various papers (and
learned by programmers) in terms of its BNF grammar.

3) Our editor is an excellent tool for teaching both the
SUE language, and the concept of BNF grammars.

Unfortunately, the cost of interactive graphics (about
$100 per hour) on the computing facility used to develop our
editor makes it financially impractical to use. This high
cost is primarily due to the fact that the computer operates
in an OS/MFT batch environment, and the use of an
interactive program ties up a memory partition for a long
period of time. If the editor were run in a true
time-sharing environment, the cost would be much lower.
Connect time charges for commercial time-sharing services
are generally of the order of $10 per hour. The use of a
less sophisticated graphics terminal would further reduce
costs.


4.2  Possible  Extensions  and  Improvements 

Our editor is most deficient in its handling of lists.
Lists are expressed in the SUE grammar by use of both right
and left recursive productions. Right recursive productions
are more natural from the programmer's point of view,
because they allow him to enter the list items from left to
right. However, adding an extra element to the middle of a
list is difficult in either case, and involves the moving of
subtrees. Hansen has incorporated a list defining mechanism
into Emily's syntax definition formalism [Hansen 71a] which
places all list elements on the same syntactic level and
facilitates the insertion or deletion of list elements.
However, use of such a mechanism creates other problems. It
involves drastically changing the SUE grammar. We are then
left with the difficult problem of proving that the
modified and original grammars both define the same
language. Even if we ignore this problem, Hansen's method
still presents some problems. It only works correctly if all
nested list constructs have unique separators. It would not
work in the case of SUE where several list types use the
semicolon (;) as the list separator.

The handling of identifiers could be improved. Linking
all instances of a particular identifier together would
allow the implementation of a cross reference facility.
Maintaining an auxiliary type stack, as done in Clark's
compiler [Clark 71a], would allow the semantic checker to do
full type checking. It is difficult to decide just how much
checking should be done by the editor, and how much is best
left to a compiler.

The programmer could be allowed to select identifier
names from a list of declared identifiers displayed on the
screen. He could be given the option of allowing only local
variables to be displayed, for as he got deeper into his
program, the number of accessible identifiers would become
quite large. Currently the programmer always enters
identifiers from the keyboard. This feature must be retained
in any case, so that he may use an identifier before it has
been declared.

If the editor is to be used to modify existing SUE
programs, it is necessary to enter them by hand, using our
editor to create the parse tree. A separate program should
be written which reads a SUE program in card image form, and
creates an O/S file containing the parse tree and symbol
tables which may be subsequently read by our editor. Such a
program would be fairly easy to implement, using the BFL
skeleton and scanner of Clark's compiler.

The expression parser turned out to be a most useful
feature. It would be a worthwhile experiment to implement
the automatic parsing of other non-terminals in the
language, for example <executable statement>, <declaration
item> and all other non-terminals derived from them. If this
idea is carried to extremes, the result is a text editor
which accepts text on a statement by statement basis, but
stores it internally, and allows the programmer to view it,
in parse tree form.

The Project SUE System Language Grammar

Those terminal symbols beginning with the character '$' are
used to control paragraphing and other semantic editor
functions. They are not part of the language.

<goal> ::= <context part> $sl <goal>
         | <procedure block>

<procedure block> ::= <data part> $sl <procedure block> $el
                    | <program part>

<data part> ::= <data name> ; $in $nl <data block> $ex $nl
                _|_ $nl

<context part> ::= <context name> ; $in $nl <context block>
                   $ex $nl _|_ $nl

<program part> ::= <program name> ; $in $nl <scope block>
                   $ex $nl _|_ $nl

<data name> ::= <data name head> <parameters> <returns>

<data name head> ::= DATA $sp <identifier> $tp

<parameters> ::= ( <identifier list> )
               |

<returns> ::= $sp returns ( <identifier list> )
            |

<identifier list> ::= <identifier> $tv
                    | <identifier list> , $sp <identifier> $tv

<data block> ::= <definition> ; $nl <data block>
               |

<definition> ::= <declaration> $ex
               | <template>

<declaration> ::= DECLARE $in $nl <declaration item>
                | <declaration> , $nl <declaration item>

<declaration item> ::= <declaration type> ( <identifier list> )


<template> ::= <macro head> $sl <macro parameters> ; $in $nl
               <macro body> $ex $nl $el END $sp MACRO
             | TYPE $sp <identifier> $tt $sp = $sp <type>
             | CONSTANT $sp <identifier> $tc $sp = $sp
               <unsigned constant>

<macro head> ::= MACRO $sp <identifier> $tm

<macro parameters> ::= ( <macro parameter list> )

<macro parameter list> ::= <identifier> $tq
                         | <macro parameter list> , $sp
                           <identifier> $tq

<macro body> ::= $mb

<type> ::= <declaration type>
         | $in $nl <record> $in $nl <field list> $ex $ex $nl END
         | $in $nl <group head> ; $in $nl <field list> $ex $ex
           $nl END

<record> ::= RECORD

<group head> ::= GROUP $sp <brief type>

<declaration type> ::= <brief type>
                     | AREA $sp <number>
                     | PROCEDURE <parameter type> <return type>
                     | INTERRUPT $sp HANDLING $sp PROCEDURE

<brief type> ::= <compound type identifier>
               | <index type>
               | POINTER $sp TO $sp <pointer identifier>
               | POWERSET $sp OF $sp <index type>
               | ARRAY ( <index list> ) $sp OF $sp <brief type>
               | <attribute> $sp <brief type>

<pointer identifier> ::= <index type identifier>
                       | <compound type identifier>
                       | <procedure identifier>
                       | <variable identifier>

<index type> ::= <index type identifier>
               | BIT ( <number> )
               | CHARACTER ( <number> )
               | ( <identifier list> )
               | ( <expression> $sp TO $sp <expression> )
               | ( <expression> $sp TO $sp * )




<constant> ::= <unsigned constant>
             | <adding operator> <number>

<unsigned constant> ::= <number>
                      | <string>
                      | <constant identifier>

<index list> ::= <index type>
               | <index list> , $sp <index type>

<parameter type> ::= $sp ACCEPTS ( <type list> )

<return type> ::= $sp RETURNS ( <type list> )
                |

<type list> ::= <declaration type>
              | <type list> , $sp <declaration type>

<field list> ::= <field declaration>
               | <variant part>
               | <field declaration> , $sp <field list>

<field declaration> ::= <type> ( <identifier list> )
                      |

<variant part> ::= <variant head> ; $in $nl <variant list>
                   $ex $nl END

<variant head> ::= CASE $sp <index type> $sp TAG $sp
                   <identifier> $tv

<variant list> ::= <else> <field list>
                 | <variant labels> <field list>
                 | <variant labels> <field list> ; $nl
                   <variant list>

<else> ::= ELSE : $sp

<variant labels> ::= <label>
                   | <variant labels> <label>

<label> ::= THEN : $sp
          | <constant> : $sp
          | <constant> $sp TO $sp <constant> : $sp

<attribute> ::= REGISTER
              | FAST
              | ALIGNED
              | ALIGNED $sp <number> , $sp <number>

<context name> ::= CONTEXT $sp <identifier> $tu

<context block> ::= <context declaration>
                  | <context block> ; $nl <context declaration>

<context declaration> ::= <template>
                        | ABSOLUTE ( <number> ) $sp
                          <declaration type>
                          $sp <identifier> $tv

<program name> ::= program $sp <procedure identifier>

<scope block> ::= <executable statement list>
                | <definition> ; $nl <scope block>
                | <open statement> ; $nl <scope block>

<open statement> ::= OPEN $sp <storage reference>

<executable statement list> ::= <executable statement>
                              | <executable statement list> ; $nl
                                <executable statement>

<executable statement> ::= <escape> <label part> <condition>
                           <with part>
                         | RETURN <from part> <condition>
                           <with part>
                         | <selector> ; $in $nl <alternatives>
                           $ex $nl END
                         | <compound>
                         | <tuple element>
                         | ASSERT $sp <tuple element>
                         |

<escape> ::= EXIT
           | REPEAT

<label part> ::= $sp < <exit identifier> >

<condition> ::= $sp UNLESS $sp <expression>
              | $sp WHEN $sp <expression>
              |

<with part> ::= $sp WITH <tuple>

<from part> ::= $sp FROM <procedure identifier>

<selector> ::= IF $sp <expression>
             | CASE $sp <index type> $sp TAG $sp <expression>

<alternatives> ::= <else> <executable statement list>
                 | <alternative labels> <executable statement list>
                 | <alternative labels> <executable statement list>
                   ; $nl <alternatives>

<alternative labels> ::= <label>
                       | <alternative labels> <label>

<compound> ::= BEGIN $in $nl $sl <scope block> $el $ex $nl END
             | CYCLE $in $nl <executable statement list> $ex
               $nl END
             | <do head> ; $in $nl <executable statement list>
               $ex $nl END
             | <exit label> $nl <compound> < <exit identifier> >

<do head> ::= DO $sp <variable identifier> := <iteration control>

<iteration control> ::= <tuple>
                      | <expression> $sp TO $sp <expression>
                      | <expression> $sp DOWNTO $sp <expression>
                      | EACH $sp <index type>

<exit label> ::= < <identifier> $te >


<tuple> ::= <tuple element>
          | <tuple> , $sp <tuple element>

<value> ::= <expression>
          | <storage reference> := <value>

<expression> ::= <logical term>
               | <expression> $sp <or operator> $sp <logical factor>
               | $kb ENTER FROM KEYBOARD

<logical term> ::= <logical factor>
                 | <logical term> $sp & $sp <logical factor>

<logical factor> ::= <string expression>
                   | <logical factor> $sp <relational operator> $sp
                     <string expression>

<string expression> ::= <numeric expression>
                      | <string expression> $sp || $sp
                        <numeric expression>

<numeric expression> ::= <numeric term>
                       | <numeric expression> <adding operator>
                         <numeric term>

<numeric term> ::= <numeric factor>
                 | ¬ <numeric factor>
                 | <adding operator> <numeric factor>
                 | <numeric term> <multiplying operator>
                   <numeric factor>

<numeric factor> ::= <storage reference>
                   | <unsigned constant>
                   | ( <tuple> )


<storage reference> ::= <variable identifier>
                      | <procedure identifier>
                      | <storage reference> @
                      | <storage reference> . <variable identifier>
                      | <storage reference> ( <tuple> )
                      | TYPED $sp <brief type> ( <tuple> )
                      | <escape type> ; $nl <compound>

<escape type> ::= EXITS $sp WITH ( <type list> )

<or operator> ::= |
                | XOR

<relational operator> ::= <
                        | >
                        | ¬=
                        | <=
                        | >=

<adding operator> ::= +
                    | -

<multiplying operator> ::= *
                         | /
                         | $sp MOD $sp

<identifier> ::= $gi

<compound type identifier> ::= $lt

<index type identifier> ::= $lt

<constant identifier> ::= $lc

<procedure identifier> ::= $lp

<exit identifier> ::= $le

<variable identifier> ::= $lv

<number> ::= $cn

<string> ::= $cs